We hosted a deep dive on Experimentation with Gautham Krishnan (Product Leader, Disney+Hotstar) and Pramod N (VP, Product & Data Science, Rapido), moderated by Saket Toshniwal (Sr. Director, MoEngage), where we covered the nuances of how to run an experiment end to end.
It was a super actionable and interesting conversation, and this document summarises it.
Before we begin, a little about our guests:
Gautham Krishnan (Product leader, Disney+Hotstar)
Gautham is currently a product leader at Disney+Hotstar. He has previously worked in product leadership roles at companies like Gameskraft, Snapdeal, and Honestbee. Apart from this, he's also part of the advisory council for ISB's PGP programme.
Pramod N (VP Product & Data Science, Rapido)
Pramod is currently the VP of Product & Data Science at Rapido, where he's responsible for product-led growth and for bringing efficiencies to growth across customer, captain, and marketplace levers. Before this, he was a Lead Consultant at Thoughtworks.
Saket Toshniwal (Sr. Director, MoEngage)
Saket has over 13 years of experience in product & growth roles. He's currently a Senior Director at MoEngage. He has been a GrowthX member for the last 6 months, and many of you must already have interacted with him at different events.
In this document, we're going to cover the questions asked and a summary of the answers given by the panelists.
There are a couple of factors when it comes to a litmus test for experimentation:
Experiments are not standalone events; they depend on an ecosystem where teams believe in data-driven decision-making. Without alignment, even the best-designed experiments fail to create impact. Stakeholders may question the validity of data or reject findings that contradict their beliefs.
Cultural resistance can block experimentation. This resistance stems from a lack of alignment or a reliance on intuition rather than data.
Key steps for alignment here:
Key steps on how to overcome resistance:
Experimentation provides clarity in situations where the outcome is unpredictable. This could include user behaviour, market reactions, or operational changes. There are two types of problem statements: Tier 1 problems and Tier 2 problems.
These are irreversible, high-stakes changes. Gautham explained:
"Once you increase prices, you cannot go back. There might be a big consumer backlash."
Examples: Overhauling the payment system, launching a new pricing strategy, or changing core app navigation.
Experimentation ensures risks are mitigated before full-scale deployment. For instance, if a new payment system is tested, the company can understand user drop-off rates and failure points before global rollout.
These are low-risk, reversible changes that don't require extensive validation.
"These are revolving doors where... you can pull back the decisions you've taken."
Examples: Introducing a new banner layout, tweaking button colors, or making minor adjustments to existing flows.
These can be rolled out directly with mechanisms for rollback if needed.
Not all problems merit experimentation. If the cost (time, resources, risk) of running an experiment outweighs the value of the insights, direct action may be a better choice.
If it's more expensive to run an experiment than to just do it, then don't bother running an experiment.
Examples:
Leaders must decide when the learning potential of an experiment outweighs its execution costs. For instance:
Hotstar was revamping the entire UI/UX of its app. This revamp was not just about cosmetic changes but fundamentally altered how users interacted with the platform, impacting critical metrics and user behaviours.
Why it was difficult to experiment:
The scale and scope of the changes made traditional A/B testing impractical. Gautham explained:
The changes were considered a Tier 1 decision because they had broad implications for user engagement and content consumption, making it impossible to roll back completely once implemented.
"When we revamped the entire app UI/UX, all the data changed: the clickstream, user navigation patterns, and how content was consumed. These changes are so significant that they couldn't be A/B tested effectively."
How they approached it:
Instead of traditional A/B testing, they employed a graceful rollout strategy:
Outcome considerations:
This strategy ensured they could capture user feedback and tweak elements before a full-scale rollout. However, such decisions required alignment across teams, as the impact on user behaviour and business metrics was substantial.
Pramod discussed the major decision at Rapido to shift from a commission-based model to a subscription-based model. This was another Tier 1 decision, as it involved a fundamental change in the company's business structure with long-term implications.
Why it was challenging:
"This is one of those tier one decisions. It has insane risk; it changes you for the next 50 years."
A wrong move could alienate key stakeholders (e.g., drivers and users), reduce trust, and negatively impact revenue streams.
How they approached it:
Outcome Considerations:
These smaller experiments provided critical data that helped mitigate risks and informed the final decision. While the shift itself wasn't directly tested at scale, the insights gathered minimised uncertainties.
A well-crafted hypothesis one-pager is essential for guiding an experiment effectively. It ensures clarity, aligns stakeholders, and sets the stage for meaningful insights. This is the first and most important step in conducting any experiment.
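As a concrete illustration (this is our sketch, not a template the panel prescribed), such a one-pager can be kept as a simple structured record so every stakeholder reviews the same fields before launch; all field names and values below are hypothetical.

```python
# Minimal sketch of a hypothesis one-pager as a structured record.
# Every field name and value here is illustrative, not a prescribed standard.
experiment_brief = {
    "problem": "New users drop off before completing their first transaction",
    "hypothesis": "Cutting onboarding from 5 steps to 3 reduces drop-off by 20%",
    "north_star_metric": "first_transaction_completion_rate",
    "guardrail_metrics": ["7_day_churn_rate", "support_ticket_rate"],
    "target_population": "new users across all regions, 50/50 control-test split",
    "minimum_detectable_effect": 0.02,   # smallest absolute lift worth acting on
    "duration_weeks": 2,
    "decision_rule": "Ship only if the north star improves and no guardrail degrades",
}
```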
There is no strict rule that experimentation requires a minimum number of Monthly Active Users (MAUs) or Daily Active Users (DAUs). Instead, it depends on what you are trying to validate and the type of risks you're addressing. Pramod explained:
"For usability risks or user-related risks, like whether users will adopt a new feature, you can test even without a large active user base. You don't even need tech to test these risks."
"You can test without a large DAU or MAU. If you understand the principles of experimentation, you can start even with just 100 users."
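To make the "100 users" point concrete, here is a rough back-of-the-envelope calculation (our own sketch, not shown in the session) of how many users per variant a two-proportion test needs: small samples are enough when the effect you care about is large, and only subtle lifts demand big traffic. The baseline rates and lifts are assumed numbers.

```python
# Rough sample-size sketch for a two-proportion test; numbers are illustrative.
from scipy.stats import norm

def users_per_variant(p_baseline: float, absolute_lift: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per group to detect the given absolute lift."""
    p_treatment = p_baseline + absolute_lift
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    variance = p_baseline * (1 - p_baseline) + p_treatment * (1 - p_treatment)
    return int((z_alpha + z_power) ** 2 * variance / absolute_lift ** 2) + 1

# A large effect (20% -> 35% conversion) is detectable with roughly 135 users per group:
print(users_per_variant(p_baseline=0.20, absolute_lift=0.15))
# A subtle 1-point lift (20% -> 21%) needs on the order of 25,000 users per group:
print(users_per_variant(p_baseline=0.20, absolute_lift=0.01))
```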
The first step is to provide the analytics team with a clear understanding of why the dashboard is needed and what it aims to achieve.
Gautham's Approach:
"The definition of the metric changes from team to team... The first step is ensuring everyone is aligned on what the metric means and how it's calculated."
Clearly define the North Star metric (primary metric of success) and guardrail metrics (to track unintended side effects).
Gautham's Insight:
"Push notifications often focus on CTR as the metric, but that's a leading indicator. The real North Star metric should be the transaction completion rate or the time spent engaging with content."
Analytics dashboards should account for homogeneity to ensure valid comparisons between control and test groups.
Gautham's Point:
"One of the main reasons experiments fail is because the population isn't homogeneous... The control and test groups must be representative of the wider user base."
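One simple way to operationalise this check (a sketch under assumed data, not the panel's exact method) is to compare a pre-experiment metric, such as sessions per user in the week before launch, across the two groups before any intervention is read.

```python
# Illustrative homogeneity check on a pre-period metric (sessions per user).
# The data below is simulated; in practice, load it from your warehouse.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
control_pre = rng.poisson(lam=5.0, size=10_000)
test_pre = rng.poisson(lam=5.0, size=10_000)

t_stat, p_value = stats.ttest_ind(control_pre, test_pre, equal_var=False)
print(f"pre-period means: control={control_pre.mean():.2f}, test={test_pre.mean():.2f}, p={p_value:.3f}")
if p_value < 0.05:
    print("Groups already differ before the intervention; re-randomise before launch.")
else:
    print("No detectable pre-period difference; groups look comparable.")
```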
What to Include in the Brief:
Include a data dictionary that lists all events, variables, and metrics used in the dashboard.
Gautham's Take:
"A data dictionary ensures that variables or events are consistently understood across teams, avoiding confusion about what's being tracked."
Example Brief Section:
Include definitions for key events, as in the illustrative sketch below.
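For illustration only, here is what a few data-dictionary entries might look like for a streaming app; the event names and definitions are hypothetical, not Hotstar's actual schema.

```python
# Hypothetical data-dictionary entries for the experiment dashboard.
data_dictionary = {
    "tile_click":         "User taps a content tile on the home screen (leading metric)",
    "playback_start":     "Playback begins for a title; fired once per title per session",
    "watch_time_seconds": "Seconds of content watched in a session (feeds the North Star)",
    "notification_open":  "Push notification opened within 24 hours of delivery",
}
```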
Stakeholders are more likely to support an experiment if they clearly understand its purpose, expected outcomes, and how success will be measured. Pre-alignment ensures there's no confusion or conflict once the experiment process has begun.
Gautham's perspective:
"If there is no alignment, and if there is no culture of experimentation, there's no point in running that experiment."
Experiments should only proceed when all stakeholders agree on the problem, hypothesis, and metrics. Misaligned expectations often lead to resistance or dismissal of the results.
How to Do It:
Example:
Before launching a pricing experiment, the team aligns on the primary objective (e.g., improving revenue), success metric (e.g., revenue per user), and guardrails (e.g., churn rate shouldn't increase beyond 5%).
A lack of experimentation culture can lead to skepticism, resistance, or outright rejection of findings. Experiments can be perceived as time-wasting or threatening, especially if they challenge long-standing beliefs.
Gautham's Insight:
"We need to be dispassionate about the outcome. It's not about whose idea it is, but whether the idea works."
He emphasised creating a culture where data, not egos, drive decisions. This requires normalising failure as part of the learning process.
Pramod's Insight:
"The biggest barrier is often people who think they know everything... It requires somebody to accept that there is a scientific way of learning things."
At Rapido, he worked to establish a culture where experiments were seen as tools for discovery rather than tools to prove someone wrong.
How to Do It:
Example:
At Disney+ Hotstar, Gautham emphasised the importance of framing results objectively, ensuring that no individual or team was blamed for an unsuccessful experiment.
Stakeholders are more likely to support an experiment if they see how it contributes to the organization's strategic objectives, such as revenue growth, user retention, or operational efficiency.
Gautham's Insight:
"Stakeholders should see that experiments contribute to larger business goals, such as retention, revenue, or engagement."
Experiments must be framed in the context of how they advance these goals.
Pramod's Insight:
"If you're dealing with significant unknowns, the cost of inaction is often higher than the cost of running an experiment."
He used this argument to justify experimentation for high-stakes decisions, such as transitioning Rapido to a subscription model.
How to Do It:
Example:
For a new subscription plan, frame the experiment as a way to increase driver retention and stabilize revenue, directly linking it to organizational growth goals.
Before interpreting the results, ensure that the sampling process was accurate and that the test and control groups were homogeneous. A failure in sampling can invalidate the conclusions.
Gautham's Insight:
"After the experiment, we still check for homogeneity to ensure the control and test groups are comparable."
How to do it:
Example:
For a new subscription feature, verify that control and test groups had similar engagement levels and subscription behaviour before the intervention.
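A related sanity check (standard practice, though not discussed verbatim in the session) is a sample-ratio-mismatch test: confirm that the observed split between control and test matches the split the experiment was configured for. The counts below are made up.

```python
# Sample-ratio-mismatch (SRM) check: did bucketing match the intended 50/50 split?
from scipy.stats import chisquare

observed = [49_950, 50_050]               # users actually assigned to control / test
expected = [sum(observed) * 0.5] * 2      # counts expected under the planned split

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"SRM p-value: {p_value:.3f}")
if p_value < 0.001:
    print("Assignment ratio deviates from plan; fix sampling before reading results.")
```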
Leading metrics (e.g., click-through rates, interaction frequency) provide an early indication of whether the intervention drove the expected behaviour.
Pramod's Insight:
"Before diving into lagging metrics, check if the leading metrics moved as expected. If they didn't, the experiment might not have had the intended impact."
What to Look For:
Example:
If a push notification experiment aimed to increase engagement, leading metrics like click-through rate or notification open rate should show positive changes.
Lagging metrics, such as retention, revenue, or lifetime value (LTV), provide a more comprehensive picture of the experiment's long-term impact.
Gautham's Insight:
"Click-through rates are a leading metric, but the real goal is whether the North Star metric, like transaction completion or watch time, increased."
What to Look For:
Example:
For a subscription experiment, measure whether users who received a personalized offer showed a higher retention rate over 30 days compared to the control group.
Guardrail metrics help you identify unintended consequences of the intervention.
Gautham's Insight:
"When optimising content, we realized that increasing click-through rates on tiles could cannibalise overall engagement by shifting attention away from long-format shows."
What to Look For:
Example:
For a UI revamp, guardrail metrics like app crash rates, session lengths, or opt-out rates for notifications can highlight negative impacts.
Statistical significance ensures that observed differences are unlikely to have occurred by chance.
Pramod's Insight:
"Human behaviour is unpredictable. Before concluding, check whether the changes are significant and not just within the range of natural variability."
How to Do It:
Example:
For an experiment showing a 2% increase in retention, ensure that this change exceeds the confidence interval and is statistically significant.
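As a minimal sketch of that check (the retention counts here are invented for illustration), a two-proportion z-test plus a confidence interval tells you whether a 2-point lift is distinguishable from noise.

```python
# Illustrative significance check for a retention lift; all counts are hypothetical.
import math
from scipy.stats import norm

retained_c, n_c = 4_000, 20_000   # control: 20.0% retained
retained_t, n_t = 4_400, 20_000   # test:    22.0% retained

p_c, p_t = retained_c / n_c, retained_t / n_t
lift = p_t - p_c

# Two-proportion z-test with a pooled standard error
p_pool = (retained_c + retained_t) / (n_c + n_t)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
p_value = 2 * (1 - norm.cdf(abs(lift / se_pool)))

# 95% confidence interval for the lift (unpooled standard error)
se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
ci = (lift - 1.96 * se, lift + 1.96 * se)

print(f"lift = {lift:.1%}, p-value = {p_value:.5f}, 95% CI = [{ci[0]:.1%}, {ci[1]:.1%}]")
```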
Unexpected results may indicate user segmentation issues, technical errors, or other confounding variables.
Pramod's Example:
"If fulfillment improved by 1.5%, the next question is whether that change is meaningful or just a result of a sampling bias."
How to Investigate:
Example:
If a pricing experiment led to higher revenue in one region but lower revenue elsewhere, investigate regional preferences or economic factors.
Hypotheses provide the framework for evaluating success and learning from the experiment.
Gautham's Insight:
"Experiments should test hypotheses directly. If the hypothesis was poorly formed, you might get results that don't align with your goals."
How to Do It:
Example:
If a hypothesis predicted that reducing onboarding steps would lower drop-off rates by 20%, check if the drop-off decreased by at least that amount.
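A tiny worked example of that comparison, with made-up numbers: translate the pre-registered target into the same units as the measurement before judging the result.

```python
# Hypothetical onboarding numbers: did the observed change meet the 20% target?
baseline_dropoff = 0.40            # drop-off rate in control
observed_dropoff = 0.34            # drop-off rate in the test variant
target_relative_reduction = 0.20   # the hypothesis written down before launch

relative_reduction = (baseline_dropoff - observed_dropoff) / baseline_dropoff
print(f"observed reduction: {relative_reduction:.0%} vs target {target_relative_reduction:.0%}")
print("hypothesis met" if relative_reduction >= target_relative_reduction else "hypothesis not met")
```

In this made-up case the 15% reduction is an improvement, but it still misses the pre-registered 20% bar, which is exactly the distinction a well-formed hypothesis forces you to make.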
Holdouts and reverse A/B testing provide additional validation by comparing users exposed to the change with those who were not, even after the experiment ends.
Pramod's Insight:
"Reverse A/B testing helps you understand whether users revert to old behaviors when the feature is removed."
How to Do It:
Example:
For a loyalty program, check whether users who lose access to benefits show a drop in engagement compared to those still enrolled.
Proper documentation ensures that findings are accessible for future experiments and decision-making.
Gautham's Insight:
"Documenting learnings helps improve the roadmap and ensures that everyone (product, analytics, design) benefits from the experiment. We socialize results widely so that everyone (engineering, design, marketing) understands the learnings and uses them to make better decisions."
What to Document:
Example:
For a UI experiment, document how changes improved the North Star metric and note any technical challenges faced during implementation.